Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 16724 |
| Missing cells | 21450 |
| Missing cells (%) | 8.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.3 MiB |
| Average record size in memory | 454.6 B |
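The headline figures above can be cross-checked against counts reported elsewhere in this profile. The sketch below is only an illustrative consistency check and is not part of the pandas-profiling output. It assumes the five columns listed in `missing_by_column` are the only ones with missing values, using the per-column missing counts given in the warnings and variable sections.

```python
# Cross-check of the dataset-level summary using counts reported in this profile.
n_variables = 16
n_observations = 16724

# Per-column missing counts as reported in the warnings and variable sections.
# Assumption: no other column has missing values.
missing_by_column = {
    "neighbourhood_group": 16724,
    "last_review": 2318,
    "reviews_per_month": 2318,
    "host_name": 59,
    "name": 31,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 267584
print(missing_cells)          # 21450
print(f"{missing_pct:.1f}%")  # 8.0%
```

The totals agree with the reported 21450 missing cells (8.0%), most of which come from the entirely empty `neighbourhood_group` column.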
Variable types
| Numeric | 10 |
|---|---|
| Categorical | 5 |
| Unsupported | 1 |
Warnings
| Warning | Type |
|---|---|
| name has a high cardinality: 16347 distinct values | High cardinality |
| host_name has a high cardinality: 5290 distinct values | High cardinality |
| last_review has a high cardinality: 2040 distinct values | High cardinality |
| id is highly correlated with host_id | High correlation |
| host_id is highly correlated with id | High correlation |
| number_of_reviews is highly correlated with reviews_per_month | High correlation |
| reviews_per_month is highly correlated with number_of_reviews | High correlation |
| id is highly correlated with calculated_host_listings_count | High correlation |
| host_id is highly correlated with calculated_host_listings_count | High correlation |
| latitude is highly correlated with calculated_host_listings_count | High correlation |
| longitude is highly correlated with calculated_host_listings_count | High correlation |
| price is highly correlated with calculated_host_listings_count | High correlation |
| minimum_nights is highly correlated with calculated_host_listings_count | High correlation |
| number_of_reviews is highly correlated with reviews_per_month and 1 other fields | High correlation |
| reviews_per_month is highly correlated with number_of_reviews and 1 other fields | High correlation |
| calculated_host_listings_count is highly correlated with id and 7 other fields | High correlation |
| longitude is highly correlated with neighbourhood and 1 other fields | High correlation |
| neighbourhood is highly correlated with longitude and 1 other fields | High correlation |
| latitude is highly correlated with longitude and 1 other fields | High correlation |
| neighbourhood_group has 16724 (100.0%) missing values | Missing |
| last_review has 2318 (13.9%) missing values | Missing |
| reviews_per_month has 2318 (13.9%) missing values | Missing |
| price is highly skewed (γ1 = 25.77570208) | Skewed |
| minimum_nights is highly skewed (γ1 = 37.94287912) | Skewed |
| name is uniformly distributed | Uniform |
| id has unique values | Unique |
| neighbourhood_group is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| number_of_reviews has 2318 (13.9%) zeros | Zeros |
| availability_365 has 10763 (64.4%) zeros | Zeros |
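The warnings above flag `price` and `minimum_nights` as highly skewed and quote a γ1 value for each. As a rough illustration of what such a number measures, here is a minimal sketch of the bias-adjusted Fisher-Pearson sample skewness. It assumes the report's γ1 is this adjusted estimator (the one pandas' `Series.skew` computes); the report itself does not say so. The function name and the toy data are hypothetical.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 (bias-corrected); needs n >= 3."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Hypothetical nightly prices: most are modest, one is extreme.
toy_prices = [95, 100, 120, 129, 150, 180, 8000]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(symmetric))       # 0.0
print(sample_skewness(toy_prices) > 2)  # True
```

A single extreme value is enough to push the statistic well above zero, which is consistent with the long right tails that the maximum values of `price` (8000) and `minimum_nights` (1100) suggest.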
Reproduction
| Analysis started | 2021-08-02 21:17:45.260106 |
|---|---|
| Analysis finished | 2021-08-02 21:18:16.243628 |
| Duration | 30.98 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
id
Numeric
| Distinct | 16724 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20974176.06 |
| Minimum | 2818 |
|---|---|
| Maximum | 50807501 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 2818 |
|---|---|
| 5-th percentile | 1935359.35 |
| Q1 | 10133057.75 |
| median | 19193415 |
| Q3 | 30854498.75 |
| 95-th percentile | 44383565.6 |
| Maximum | 50807501 |
| Range | 50804683 |
| Interquartile range (IQR) | 20721441 |
Descriptive statistics
| Standard deviation | 13275219.55 |
|---|---|
| Coefficient of variation (CV) | 0.6329316355 |
| Kurtosis | -0.8381837558 |
| Mean | 20974176.06 |
| Median Absolute Deviation (MAD) | 9996165 |
| Skewness | 0.3936053076 |
| Sum | 3.507721204 × 10^11 |
| Variance | 1.762314542 × 10^14 |
| Monotonicity | Strictly increasing |
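Several of the statistics above follow directly from the others, which gives a quick way to sanity-check the table. The sketch below recomputes the range, interquartile range, coefficient of variation, and variance from values copied out of this section. It is only an arithmetic check, not a description of how pandas-profiling derives them internally.

```python
# Values copied from the quantile and descriptive statistics above.
minimum, maximum = 2818, 50807501
q1, q3 = 10133057.75, 30854498.75
mean, std = 20974176.06, 13275219.55

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation

print(value_range)        # 50804683
print(iqr)                # 20721441.0
print(round(cv, 4))       # 0.6329
print(f"{variance:.3e}")  # 1.762e+14
```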
| Value | Count | Frequency (%) |
| 2818 | 1 | < 0.1% |
| 25834331 | 1 | < 0.1% |
| 25848655 | 1 | < 0.1% |
| 25853345 | 1 | < 0.1% |
| 25856483 | 1 | < 0.1% |
| 25859560 | 1 | < 0.1% |
| 25875180 | 1 | < 0.1% |
| 25877471 | 1 | < 0.1% |
| 25877705 | 1 | < 0.1% |
| 25879495 | 1 | < 0.1% |
| Other values (16714) | 16714 | 99.9% |
| Value | Count | Frequency (%) |
| 2818 | 1 | |
| 20168 | 1 | |
| 25428 | 1 | |
| 27886 | 1 | |
| 28871 | 1 | |
| 29051 | 1 | |
| 41125 | 1 | |
| 43109 | 1 | |
| 43980 | 1 | |
| 46386 | 1 |
| Value | Count | Frequency (%) |
| 50807501 | 1 | |
| 50806119 | 1 | |
| 50800575 | 1 | |
| 50770121 | 1 | |
| 50770088 | 1 | |
| 50770030 | 1 | |
| 50767325 | 1 | |
| 50765265 | 1 | |
| 50762636 | 1 | |
| 50760230 | 1 |
name
Categorical
| Distinct | 16347 |
|---|---|
| Distinct (%) | 97.9% |
| Missing | 31 |
| Missing (%) | 0.2% |
| Memory size | 1.5 MiB |
| Amsterdam | 30 |
|---|---|
| Residences \| 2-Bedrooms \| Serviced Apartment | 16 |
| Lovely apartment near Vondelpark | 6 |
| Spacious apartment near Vondelpark | 6 |
| Cosy apartment in the city centre of Amsterdam | 5 |
| Other values (16342) |
Length
| Max length | 124 |
|---|---|
| Median length | 40 |
| Mean length | 38.83687773 |
| Min length | 1 |
Characters and Unicode
| Total characters | 648304 |
|---|---|
| Distinct characters | 183 |
| Distinct categories | 19 |
| Distinct scripts | 8 |
| Distinct blocks | 17 |
Unique
| Unique | 16112 |
|---|---|
| Unique (%) | 96.5% |
Sample
| 1st row | Quiet Garden View Room & Super Fast WiFi |
|---|---|
| 2nd row | Studio with private bathroom in the centre 1 |
| 3rd row | Lovely, sunny 1 bed apt in Ctr (w.lift) & firepl. |
| 4th row | Romantic, stylish B&B houseboat in canal district |
| 5th row | Comfortable double room |
Common Values
| Value | Count | Frequency (%) |
| Amsterdam | 30 | 0.2% |
| Residences \| 2-Bedrooms \| Serviced Apartment | 16 | 0.1% |
| Lovely apartment near Vondelpark | 6 | < 0.1% |
| Spacious apartment near Vondelpark | 6 | < 0.1% |
| Cosy apartment in the city centre of Amsterdam | 5 | < 0.1% |
| Amsterdam Appartement | 5 | < 0.1% |
| Cosy apartment in Amsterdam | 5 | < 0.1% |
| Spacious apartment in Amsterdam | 5 | < 0.1% |
| Apartment in Amsterdam | 5 | < 0.1% |
| Spacious apartment with garden | 5 | < 0.1% |
| Other values (16337) | 16605 | 99.3% |
| (Missing) | 31 | 0.2% |
Words
| Value | Count | Frequency (%) |
| apartment | 6517 | 6.4% |
| in | 5681 | 5.6% |
| amsterdam | 3915 | 3.8% |
| | 3205 | 3.2% |
| with | 2680 | 2.6% |
| the | 2194 | 2.2% |
| spacious | 2039 | 2.0% |
| city | 1726 | 1.7% |
| room | 1598 | 1.6% |
| and | 1597 | 1.6% |
| Other values (4902) | 70562 | 69.4% |
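The table above lists the most frequent lowercase tokens found in the listing names. A minimal sketch of how such counts can be produced is shown below, using the five sample names quoted in this report. It assumes tokens are obtained by lowercasing and splitting on whitespace; the exact tokenization used by the report is not stated and may differ (for example in how punctuation is handled). The helper name `word_counts` is hypothetical.

```python
from collections import Counter

# The five sample listing names shown in this report.
sample_names = [
    "Quiet Garden View Room & Super Fast WiFi",
    "Studio with private bathroom in the centre 1",
    "Lovely, sunny 1 bed apt in Ctr (w.lift) & firepl.",
    "Romantic, stylish B&B houseboat in canal district",
    "Comfortable double room",
]

def word_counts(names):
    """Count lowercase, whitespace-separated tokens across all names."""
    counts = Counter()
    for name in names:
        counts.update(name.lower().split())
    return counts

counts = word_counts(sample_names)
print(counts["in"])    # 3
print(counts["room"])  # 2
print(counts["&"])     # 2
```

With this naive split, punctuation-only tokens such as `&` are counted as words, which may explain why a token without visible text appears in the table.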
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 85321 | 13.2% |
| e | 57245 | 8.8% |
| t | 53063 | 8.2% |
| a | 49614 | 7.7% |
| r | 41409 | 6.4% |
| n | 37577 | 5.8% |
| o | 34670 | 5.3% |
| i | 32259 | 5.0% |
| m | 26031 | 4.0% |
| s | 21075 | 3.3% |
| Other values (173) | 210040 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 493372 | 76.1% |
| Space Separator | 85336 | 13.2% |
| Uppercase Letter | 49899 | 7.7% |
| Other Punctuation | 9695 | 1.5% |
| Decimal Number | 5295 | 0.8% |
| Dash Punctuation | 1838 | 0.3% |
| Math Symbol | 1098 | 0.2% |
| Close Punctuation | 653 | 0.1% |
| Open Punctuation | 634 | 0.1% |
| Other Symbol | 295 | < 0.1% |
| Other values (9) | 189 | < 0.1% |
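The category names in the table above correspond to Unicode general categories. The sketch below shows how individual characters map to those categories using Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping is a small hand-written subset for illustration, and `category_counts` is a hypothetical helper, not the report's own code.

```python
import unicodedata
from collections import Counter

# Hand-written subset of Unicode general-category codes and their long names.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
    "So": "Other Symbol",
}

def category_counts(text):
    """Count characters per Unicode general category (long name where known)."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

counts = category_counts("Residences | 2-Bedrooms ★")
print(counts["Math Symbol"])       # 1  (the "|" character)
print(counts["Other Symbol"])      # 1  (the star)
print(counts["Dash Punctuation"])  # 1
print(counts["Space Separator"])   # 3
```

Note that the vertical bar is classified as a Math Symbol, which is why listing names that use `|` as a separator contribute to that category.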
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 8494 | |
| C | 5755 | 11.5% |
| S | 4278 | 8.6% |
| B | 3040 | 6.1% |
| L | 2884 | 5.8% |
| P | 2621 | 5.3% |
| R | 2474 | 5.0% |
| E | 1984 | 4.0% |
| T | 1875 | 3.8% |
| N | 1841 | 3.7% |
| Other values (28) | 14653 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 57245 | |
| t | 53063 | |
| a | 49614 | |
| r | 41409 | 8.4% |
| n | 37577 | 7.6% |
| o | 34670 | 7.0% |
| i | 32259 | 6.5% |
| m | 26031 | 5.3% |
| s | 21075 | 4.3% |
| p | 18368 | 3.7% |
| Other values (22) | 122061 |
Other Symbol
| Value | Count | Frequency (%) |
| ★ | 213 | |
| ☆ | 17 | 5.8% |
| ❤ | 10 | 3.4% |
| ♥ | 10 | 3.4% |
| ⭐ | 8 | 2.7% |
| ☀ | 3 | 1.0% |
| 🌳 | 3 | 1.0% |
| ● | 3 | 1.0% |
| ⏹ | 2 | 0.7% |
| ° | 2 | 0.7% |
| Other values (18) | 24 | 8.1% |
Other Letter
| Value | Count | Frequency (%) |
| 年 | 2 | 6.7% |
| ム | 2 | 6.7% |
| 任 | 2 | 6.7% |
| 行 | 2 | 6.7% |
| م | 1 | 3.3% |
| ب | 1 | 3.3% |
| س | 1 | 3.3% |
| و | 1 | 3.3% |
| ط | 1 | 3.3% |
| 末 | 1 | 3.3% |
| Other values (16) | 16 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 2653 | |
| ! | 1939 | |
| & | 1578 | |
| . | 1191 | |
| ' | 763 | 7.9% |
| / | 586 | 6.0% |
| @ | 292 | 3.0% |
| " | 234 | 2.4% |
| : | 225 | 2.3% |
| * | 155 | 1.6% |
| Other values (7) | 79 | 0.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1786 | |
| 1 | 927 | |
| 0 | 799 | |
| 5 | 469 | 8.9% |
| 4 | 398 | 7.5% |
| 3 | 392 | 7.4% |
| 6 | 135 | 2.5% |
| 8 | 133 | 2.5% |
| 7 | 130 | 2.5% |
| 9 | 126 | 2.4% |
Math Symbol
| Value | Count | Frequency (%) |
| \| | 550 | 50.1% |
| + | 514 | 46.8% |
| < | 13 | 1.2% |
| > | 9 | 0.8% |
| ~ | 8 | 0.7% |
| = | 1 | 0.1% |
| → | 1 | 0.1% |
| ≠ | 1 | 0.1% |
| ÷ | 1 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| 85321 | ||
| 14 | < 0.1% | |
| 1 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1830 | |
| – | 5 | 0.3% |
| — | 3 | 0.2% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 630 | |
| [ | 4 | 0.6% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 649 | |
| ] | 4 | 0.6% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ️ | 17 | |
| ︎ | 1 | 5.6% |
Initial Punctuation
| Value | Count | Frequency (%) |
| ‘ | 18 | |
| “ | 15 |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 52 | |
| ” | 14 | 21.2% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 1 | |
| € | 1 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 3 | |
| ^ | 1 | 25.0% |
Other Number
| Value | Count | Frequency (%) |
| ² | 18 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3 |
Control
| Value | Count | Frequency (%) |
| 15 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 543253 | |
| Common | 104985 | 16.2% |
| Inherited | 18 | < 0.1% |
| Cyrillic | 18 | < 0.1% |
| Han | 16 | < 0.1% |
| Katakana | 7 | < 0.1% |
| Arabic | 5 | < 0.1% |
| Hiragana | 2 | < 0.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 85321 | ||
| , | 2653 | 2.5% |
| ! | 1939 | 1.8% |
| - | 1830 | 1.7% |
| 2 | 1786 | 1.7% |
| & | 1578 | 1.5% |
| . | 1191 | 1.1% |
| 1 | 927 | 0.9% |
| 0 | 799 | 0.8% |
| ' | 763 | 0.7% |
| Other values (75) | 6198 | 5.9% |
Latin
| Value | Count | Frequency (%) |
| e | 57245 | 10.5% |
| t | 53063 | 9.8% |
| a | 49614 | 9.1% |
| r | 41409 | 7.6% |
| n | 37577 | 6.9% |
| o | 34670 | 6.4% |
| i | 32259 | 5.9% |
| m | 26031 | 4.8% |
| s | 21075 | 3.9% |
| p | 18368 | 3.4% |
| Other values (48) | 171942 |
Han
| Value | Count | Frequency (%) |
| 年 | 2 | |
| 任 | 2 | |
| 行 | 2 | |
| 末 | 1 | 6.2% |
| 始 | 1 | 6.2% |
| 你 | 1 | 6.2% |
| 我 | 1 | 6.2% |
| 两 | 1 | 6.2% |
| 个 | 1 | 6.2% |
| 人 | 1 | 6.2% |
| Other values (3) | 3 |
Cyrillic
| Value | Count | Frequency (%) |
| Е | 4 | |
| А | 2 | |
| М | 2 | |
| О | 2 | |
| м | 1 | 5.6% |
| Н | 1 | 5.6% |
| З | 1 | 5.6% |
| Б | 1 | 5.6% |
| Ы | 1 | 5.6% |
| В | 1 | 5.6% |
| Other values (2) | 2 |
Katakana
| Value | Count | Frequency (%) |
| ム | 2 | |
| ア | 1 | |
| ス | 1 | |
| テ | 1 | |
| ル | 1 | |
| ダ | 1 |
Arabic
| Value | Count | Frequency (%) |
| م | 1 | |
| ب | 1 | |
| س | 1 | |
| و | 1 | |
| ط | 1 |
Inherited
| Value | Count | Frequency (%) |
| ️ | 17 | |
| ︎ | 1 | 5.6% |
Hiragana
| Value | Count | Frequency (%) |
| を | 1 | |
| で | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 647744 | |
| Misc Symbols | 250 | < 0.1% |
| Punctuation | 113 | < 0.1% |
| Latin 1 Sup | 84 | < 0.1% |
| None | 21 | < 0.1% |
| VS | 18 | < 0.1% |
| Cyrillic | 18 | < 0.1% |
| Dingbats | 17 | < 0.1% |
| CJK | 16 | < 0.1% |
| Katakana | 7 | < 0.1% |
| Other values (7) | 16 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 85321 | ||
| e | 57245 | 8.8% |
| t | 53063 | 8.2% |
| a | 49614 | 7.7% |
| r | 41409 | 6.4% |
| n | 37577 | 5.8% |
| o | 34670 | 5.4% |
| i | 32259 | 5.0% |
| m | 26031 | 4.0% |
| s | 21075 | 3.3% |
| Other values (83) | 209480 |
Misc Symbols
| Value | Count | Frequency (%) |
| ★ | 213 | |
| ☆ | 17 | 6.8% |
| ♥ | 10 | 4.0% |
| ☀ | 3 | 1.2% |
| ☘ | 2 | 0.8% |
| ♕ | 2 | 0.8% |
| ♛ | 2 | 0.8% |
| ☼ | 1 | 0.4% |
Latin 1 Sup
| Value | Count | Frequency (%) |
| é | 29 | |
| ² | 18 | |
| 14 | ||
| · | 8 | 9.5% |
| à | 3 | 3.6% |
| É | 3 | 3.6% |
| ° | 2 | 2.4% |
| ó | 2 | 2.4% |
| ü | 2 | 2.4% |
| á | 1 | 1.2% |
| Other values (2) | 2 | 2.4% |
Dingbats
| Value | Count | Frequency (%) |
| ❤ | 10 | |
| ✨ | 2 | 11.8% |
| ❂ | 2 | 11.8% |
| ✭ | 1 | 5.9% |
| ✅ | 1 | 5.9% |
| ✯ | 1 | 5.9% |
VS
| Value | Count | Frequency (%) |
| ️ | 17 | |
| ︎ | 1 | 5.6% |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 52 | |
| ‘ | 18 | 15.9% |
| “ | 15 | 13.3% |
| ” | 14 | 12.4% |
| • | 6 | 5.3% |
| – | 5 | 4.4% |
| — | 3 | 2.7% |
None
| Value | Count | Frequency (%) |
| ⭐ | 8 | |
| 🌳 | 3 | 14.3% |
| 🏡 | 2 | 9.5% |
| 🌱 | 1 | 4.8% |
| 🌟 | 1 | 4.8% |
| 💕 | 1 | 4.8% |
| 1 | 4.8% | |
| 🍀 | 1 | 4.8% |
| 🗝 | 1 | 4.8% |
| 💙 | 1 | 4.8% |
Arabic
| Value | Count | Frequency (%) |
| م | 1 | |
| ب | 1 | |
| س | 1 | |
| و | 1 | |
| ط | 1 |
Misc Technical
| Value | Count | Frequency (%) |
| ⏹ | 2 | |
| ⏺ | 1 |
Geometric Shapes
| Value | Count | Frequency (%) |
| ● | 3 |
Arrows
| Value | Count | Frequency (%) |
| → | 1 |
Currency Symbols
| Value | Count | Frequency (%) |
| € | 1 |
Math Operators
| Value | Count | Frequency (%) |
| ≠ | 1 |
Cyrillic
| Value | Count | Frequency (%) |
| Е | 4 | |
| А | 2 | |
| М | 2 | |
| О | 2 | |
| м | 1 | 5.6% |
| Н | 1 | 5.6% |
| З | 1 | 5.6% |
| Б | 1 | 5.6% |
| Ы | 1 | 5.6% |
| В | 1 | 5.6% |
| Other values (2) | 2 |
CJK
| Value | Count | Frequency (%) |
| 年 | 2 | |
| 任 | 2 | |
| 行 | 2 | |
| 末 | 1 | 6.2% |
| 始 | 1 | 6.2% |
| 你 | 1 | 6.2% |
| 我 | 1 | 6.2% |
| 两 | 1 | 6.2% |
| 个 | 1 | 6.2% |
| 人 | 1 | 6.2% |
| Other values (3) | 3 |
Hiragana
| Value | Count | Frequency (%) |
| を | 1 | |
| で | 1 |
Katakana
| Value | Count | Frequency (%) |
| ム | 2 | |
| ア | 1 | |
| ス | 1 | |
| テ | 1 | |
| ル | 1 | |
| ダ | 1 |
host_id
Numeric
| Distinct | 14654 |
|---|---|
| Distinct (%) | 87.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 69084135.44 |
| Minimum | 3159 |
|---|---|
| Maximum | 410277680 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 3159 |
|---|---|
| 5-th percentile | 2313043.15 |
| Q1 | 9658875.5 |
| median | 29172584 |
| Q3 | 89079912.25 |
| 95-th percentile | 278726025.1 |
| Maximum | 410277680 |
| Range | 410274521 |
| Interquartile range (IQR) | 79421036.75 |
Descriptive statistics
| Standard deviation | 89489057.11 |
|---|---|
| Coefficient of variation (CV) | 1.29536335 |
| Kurtosis | 2.647025894 |
| Mean | 69084135.44 |
| Median Absolute Deviation (MAD) | 23837373.5 |
| Skewness | 1.81551279 |
| Sum | 1.155363081 × 10^12 |
| Variance | 8.008291342 × 10^15 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 21800150 | 50 | 0.3% |
| 14874061 | 31 | 0.2% |
| 14574533 | 21 | 0.1% |
| 178187873 | 21 | 0.1% |
| 113977564 | 20 | 0.1% |
| 203731852 | 20 | 0.1% |
| 67005410 | 16 | 0.1% |
| 241644101 | 15 | 0.1% |
| 21167882 | 14 | 0.1% |
| 325213924 | 14 | 0.1% |
| Other values (14644) | 16502 |
| Value | Count | Frequency (%) |
| 3159 | 1 | |
| 3592 | 1 | |
| 7924 | 1 | |
| 12085 | 1 | |
| 47517 | 1 | |
| 49851 | 1 | |
| 56142 | 1 | |
| 57920 | 1 | |
| 58458 | 1 | |
| 59484 | 2 |
| Value | Count | Frequency (%) |
| 410277680 | 1 | < 0.1% |
| 408898089 | 6 | |
| 408480789 | 1 | < 0.1% |
| 408193064 | 1 | < 0.1% |
| 408121917 | 2 | < 0.1% |
| 408082760 | 1 | < 0.1% |
| 407847740 | 1 | < 0.1% |
| 407619058 | 1 | < 0.1% |
| 407601811 | 1 | < 0.1% |
| 407472735 | 2 | < 0.1% |
host_name
Categorical
| Distinct | 5290 |
|---|---|
| Distinct (%) | 31.7% |
| Missing | 59 |
| Missing (%) | 0.4% |
| Memory size | 1.0 MiB |
| Peter | 79 |
|---|---|
| Jasper | 78 |
| Sophie | 72 |
| Jeroen | 72 |
| Martijn | 72 |
| Other values (5285) |
Length
| Max length | 35 |
|---|---|
| Median length | 6 |
| Mean length | 6.371557156 |
| Min length | 1 |
Characters and Unicode
| Total characters | 106182 |
|---|---|
| Distinct characters | 95 |
| Distinct categories | 12 |
| Distinct scripts | 3 |
| Distinct blocks | 5 |
Unique
| Unique | 3279 |
|---|---|
| Unique (%) | 19.7% |
Sample
| 1st row | Daniel |
|---|---|
| 2nd row | Alexander |
| 3rd row | Joan |
| 4th row | Flip |
| 5th row | Edwin |
Common Values
| Value | Count | Frequency (%) |
| Peter | 79 | 0.5% |
| Jasper | 78 | 0.5% |
| Sophie | 72 | 0.4% |
| Jeroen | 72 | 0.4% |
| Martijn | 72 | 0.4% |
| Thomas | 71 | 0.4% |
| Eva | 69 | 0.4% |
| Anne | 69 | 0.4% |
| Marieke | 66 | 0.4% |
| Joost | 66 | 0.4% |
| Other values (5280) | 15951 |
Words
| Value | Count | Frequency (%) |
| | 567 | 3.0% |
| hotel | 150 | 0.8% |
| and | 128 | 0.7% |
| jan | 101 | 0.5% |
| peter | 93 | 0.5% |
| eva | 88 | 0.5% |
| jasper | 84 | 0.4% |
| en | 83 | 0.4% |
| jeroen | 83 | 0.4% |
| anne | 81 | 0.4% |
| Other values (4857) | 17587 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 12191 | 11.5% |
| a | 11355 | 10.7% |
| i | 8784 | 8.3% |
| n | 8699 | 8.2% |
| r | 6915 | 6.5% |
| o | 5043 | 4.7% |
| l | 4971 | 4.7% |
| t | 4181 | 3.9% |
| s | 3671 | 3.5% |
| 2389 | 2.2% | |
| Other values (85) | 37983 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 83878 | |
| Uppercase Letter | 18938 | 17.8% |
| Space Separator | 2389 | 2.2% |
| Other Punctuation | 744 | 0.7% |
| Dash Punctuation | 138 | 0.1% |
| Decimal Number | 29 | < 0.1% |
| Open Punctuation | 25 | < 0.1% |
| Close Punctuation | 25 | < 0.1% |
| Math Symbol | 8 | < 0.1% |
| Other Symbol | 4 | < 0.1% |
| Other values (2) | 4 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 12191 | |
| a | 11355 | |
| i | 8784 | |
| n | 8699 | |
| r | 6915 | |
| o | 5043 | 6.0% |
| l | 4971 | 5.9% |
| t | 4181 | 5.0% |
| s | 3671 | 4.4% |
| u | 2202 | 2.6% |
| Other values (37) | 15866 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 2360 | 12.5% |
| J | 1740 | 9.2% |
| A | 1595 | 8.4% |
| S | 1506 | 8.0% |
| R | 1135 | 6.0% |
| E | 1099 | 5.8% |
| L | 1037 | 5.5% |
| C | 881 | 4.7% |
| T | 792 | 4.2% |
| B | 787 | 4.2% |
| Other values (19) | 6006 |
Other Punctuation
| Value | Count | Frequency (%) |
| & | 613 | |
| . | 83 | 11.2% |
| , | 21 | 2.8% |
| ' | 14 | 1.9% |
| / | 13 | 1.7% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 12 | |
| 3 | 7 | |
| 0 | 6 | |
| 5 | 3 | 10.3% |
| 2 | 1 | 3.4% |
Other Letter
| Value | Count | Frequency (%) |
| 雨 | 1 | |
| 婷 | 1 |
Space Separator
| Value | Count | Frequency (%) |
| 2389 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 8 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 25 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 25 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 138 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 2 |
Other Symbol
| Value | Count | Frequency (%) |
| ★ | 4 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 102816 | |
| Common | 3364 | 3.2% |
| Han | 2 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 12191 | 11.9% |
| a | 11355 | 11.0% |
| i | 8784 | 8.5% |
| n | 8699 | 8.5% |
| r | 6915 | 6.7% |
| o | 5043 | 4.9% |
| l | 4971 | 4.8% |
| t | 4181 | 4.1% |
| s | 3671 | 3.6% |
| M | 2360 | 2.3% |
| Other values (66) | 34646 |
Common
| Value | Count | Frequency (%) |
| 2389 | ||
| & | 613 | 18.2% |
| - | 138 | 4.1% |
| . | 83 | 2.5% |
| ( | 25 | 0.7% |
| ) | 25 | 0.7% |
| , | 21 | 0.6% |
| ' | 14 | 0.4% |
| / | 13 | 0.4% |
| 1 | 12 | 0.4% |
| Other values (7) | 31 | 0.9% |
Han
| Value | Count | Frequency (%) |
| 雨 | 1 | |
| 婷 | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 105985 | |
| Latin 1 Sup | 183 | 0.2% |
| Latin Ext A | 8 | < 0.1% |
| Misc Symbols | 4 | < 0.1% |
| CJK | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 12191 | 11.5% |
| a | 11355 | 10.7% |
| i | 8784 | 8.3% |
| n | 8699 | 8.2% |
| r | 6915 | 6.5% |
| o | 5043 | 4.8% |
| l | 4971 | 4.7% |
| t | 4181 | 3.9% |
| s | 3671 | 3.5% |
| 2389 | 2.3% | |
| Other values (58) | 37786 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| é | 86 | |
| ë | 31 | 16.9% |
| è | 15 | 8.2% |
| á | 9 | 4.9% |
| í | 9 | 4.9% |
| ï | 6 | 3.3% |
| ç | 6 | 3.3% |
| ö | 4 | 2.2% |
| ü | 3 | 1.6% |
| ó | 3 | 1.6% |
| Other values (11) | 11 | 6.0% |
Latin Ext A
| Value | Count | Frequency (%) |
| ı | 5 | |
| ė | 2 | 25.0% |
| ă | 1 | 12.5% |
CJK
| Value | Count | Frequency (%) |
| 雨 | 1 | |
| 婷 | 1 |
Misc Symbols
| Value | Count | Frequency (%) |
| ★ | 4 |
neighbourhood
Categorical
| Distinct | 22 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| De Baarsjes - Oud-West | 2791 |
|---|---|
| De Pijp - Rivierenbuurt | 2052 |
| Centrum-West | 1827 |
| Centrum-Oost | 1409 |
| Zuid | 1252 |
| Other values (17) | 7393 |
Length
| Max length | 38 |
|---|---|
| Median length | 12 |
| Mean length | 15.83765845 |
| Min length | 4 |
Characters and Unicode
| Total characters | 264869 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Oostelijk Havengebied - Indische Buurt |
|---|---|
| 2nd row | Centrum-Oost |
| 3rd row | Centrum-West |
| 4th row | Centrum-West |
| 5th row | Centrum-West |
Common Values
| Value | Count | Frequency (%) |
| De Baarsjes - Oud-West | 2791 | 16.7% |
| De Pijp - Rivierenbuurt | 2052 | 12.3% |
| Centrum-West | 1827 | 10.9% |
| Centrum-Oost | 1409 | 8.4% |
| Zuid | 1252 | 7.5% |
| Westerpark | 1238 | 7.4% |
| Oud-Oost | 1090 | 6.5% |
| Bos en Lommer | 951 | 5.7% |
| Oostelijk Havengebied - Indische Buurt | 772 | 4.6% |
| Oud-Noord | 532 | 3.2% |
| Other values (12) | 2810 | 16.8% |
Words
| Value | Count | Frequency (%) |
| | 6681 | 17.3% |
| de | 4967 | 12.9% |
| baarsjes | 2791 | 7.2% |
| oud-west | 2791 | 7.2% |
| pijp | 2052 | 5.3% |
| rivierenbuurt | 2052 | 5.3% |
| centrum-west | 1827 | 4.7% |
| centrum-oost | 1409 | 3.6% |
| zuid | 1252 | 3.2% |
| westerpark | 1238 | 3.2% |
| Other values (27) | 11563 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 34539 | 13.0% |
| 21899 | 8.3% | |
| r | 20256 | 7.6% |
| s | 18047 | 6.8% |
| t | 17949 | 6.8% |
| u | 16222 | 6.1% |
| - | 15093 | 5.7% |
| i | 10993 | 4.2% |
| a | 10781 | 4.1% |
| d | 9708 | 3.7% |
| Other values (31) | 89382 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 188070 | |
| Uppercase Letter | 39807 | 15.0% |
| Space Separator | 21899 | 8.3% |
| Dash Punctuation | 15093 | 5.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 34539 | |
| r | 20256 | |
| s | 18047 | |
| t | 17949 | |
| u | 16222 | |
| i | 10993 | 5.8% |
| a | 10781 | 5.7% |
| d | 9708 | 5.2% |
| n | 8945 | 4.8% |
| o | 8642 | 4.6% |
| Other values (13) | 31988 |
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 8131 | |
| W | 6691 | |
| D | 5080 | |
| B | 4922 | |
| C | 3332 | |
| P | 2052 | 5.2% |
| R | 2052 | 5.2% |
| Z | 1876 | 4.7% |
| N | 1231 | 3.1% |
| I | 1176 | 3.0% |
| Other values (6) | 3264 |
Space Separator
| Value | Count | Frequency (%) |
| 21899 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 15093 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 227877 | |
| Common | 36992 | 14.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 34539 | |
| r | 20256 | 8.9% |
| s | 18047 | 7.9% |
| t | 17949 | 7.9% |
| u | 16222 | 7.1% |
| i | 10993 | 4.8% |
| a | 10781 | 4.7% |
| d | 9708 | 4.3% |
| n | 8945 | 3.9% |
| o | 8642 | 3.8% |
| Other values (29) | 71795 |
Common
| Value | Count | Frequency (%) |
| 21899 | ||
| - | 15093 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 264869 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 34539 | 13.0% |
| 21899 | 8.3% | |
| r | 20256 | 7.6% |
| s | 18047 | 6.8% |
| t | 17949 | 6.8% |
| u | 16222 | 6.1% |
| - | 15093 | 5.7% |
| i | 10993 | 4.2% |
| a | 10781 | 4.1% |
| d | 9708 | 3.7% |
| Other values (31) | 89382 |
latitude
Numeric
| Distinct | 5829 |
|---|---|
| Distinct (%) | 34.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 52.36545188 |
| Minimum | 52.27882 |
|---|---|
| Maximum | 52.42804 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 52.27882 |
|---|---|
| 5-th percentile | 52.3423915 |
| Q1 | 52.3551 |
| median | 52.36483 |
| Q3 | 52.37534 |
| 95-th percentile | 52.3914155 |
| Maximum | 52.42804 |
| Range | 0.14922 |
| Interquartile range (IQR) | 0.02024 |
Descriptive statistics
| Standard deviation | 0.01660735033 |
|---|---|
| Coefficient of variation (CV) | 0.0003171432639 |
| Kurtosis | 2.146954534 |
| Mean | 52.36545188 |
| Median Absolute Deviation (MAD) | 0.01003 |
| Skewness | -0.1492107436 |
| Sum | 875759.8173 |
| Variance | 0.0002758040848 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 52.37426 | 23 | 0.1% |
| 52.3643 | 16 | 0.1% |
| 52.37392 | 15 | 0.1% |
| 52.35345 | 14 | 0.1% |
| 52.3566 | 13 | 0.1% |
| 52.3881 | 12 | 0.1% |
| 52.36541 | 12 | 0.1% |
| 52.37211 | 12 | 0.1% |
| 52.36199 | 11 | 0.1% |
| 52.35813 | 11 | 0.1% |
| Other values (5819) | 16585 |
| Value | Count | Frequency (%) |
| 52.27882 | 1 | |
| 52.29034 | 1 | |
| 52.2909 | 1 | |
| 52.29122 | 1 | |
| 52.29125 | 1 | |
| 52.29132 | 1 | |
| 52.29145 | 1 | |
| 52.29158 | 1 | |
| 52.29165 | 1 | |
| 52.2917 | 1 |
| Value | Count | Frequency (%) |
| 52.42804 | 1 | |
| 52.42534 | 1 | |
| 52.42512 | 1 | |
| 52.42482 | 1 | |
| 52.42476 | 1 | |
| 52.42467 | 1 | |
| 52.42447 | 1 | |
| 52.42397 | 1 | |
| 52.42395 | 1 | |
| 52.42379 | 1 |
longitude
Numeric
| Distinct | 9116 |
|---|---|
| Distinct (%) | 54.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.889622679 |
| Minimum | 4.75571 |
|---|---|
| Maximum | 5.065527 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 4.75571 |
|---|---|
| 5-th percentile | 4.8438275 |
| Q1 | 4.863935 |
| median | 4.88704 |
| Q3 | 4.90941 |
| 95-th percentile | 4.948211 |
| Maximum | 5.065527 |
| Range | 0.309817 |
| Interquartile range (IQR) | 0.045475 |
Descriptive statistics
| Standard deviation | 0.03624438126 |
|---|---|
| Coefficient of variation (CV) | 0.00741251087 |
| Kurtosis | 1.100458643 |
| Mean | 4.889622679 |
| Median Absolute Deviation (MAD) | 0.02277 |
| Skewness | 0.5387269765 |
| Sum | 81774.04969 |
| Variance | 0.001313655173 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.89926 | 18 | 0.1% |
| 4.91438 | 14 | 0.1% |
| 4.90221 | 14 | 0.1% |
| 4.89635 | 13 | 0.1% |
| 4.88867 | 10 | 0.1% |
| 4.89173 | 9 | 0.1% |
| 4.87251 | 8 | < 0.1% |
| 4.91033 | 8 | < 0.1% |
| 4.87667 | 8 | < 0.1% |
| 4.88857 | 8 | < 0.1% |
| Other values (9106) | 16614 |
| Value | Count | Frequency (%) |
| 4.75571 | 1 | |
| 4.75724 | 1 | |
| 4.76287 | 1 | |
| 4.76961 | 1 | |
| 4.76995 | 1 | |
| 4.77003 | 1 | |
| 4.77139 | 1 | |
| 4.77153 | 1 | |
| 4.77346 | 1 | |
| 4.77507 | 1 |
| Value | Count | Frequency (%) |
| 5.065527 | 1 | |
| 5.02799 | 1 | |
| 5.02643 | 1 | |
| 5.01839 | 1 | |
| 5.01813 | 1 | |
| 5.01685 | 1 | |
| 5.01667 | 1 | |
| 5.01578 | 1 | |
| 5.01478 | 1 | |
| 5.01471 | 1 |
room_type
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Entire home/apt | 12992 |
|---|---|
| Private room | 3569 |
| Hotel room | 116 |
| Shared room | 47 |
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 14.31386032 |
| Min length | 10 |
Characters and Unicode
| Total characters | 239385 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Private room |
|---|---|
| 2nd row | Private room |
| 3rd row | Entire home/apt |
| 4th row | Private room |
| 5th row | Private room |
Common Values
| Value | Count | Frequency (%) |
| Entire home/apt | 12992 | 77.7% |
| Private room | 3569 | 21.3% |
| Hotel room | 116 | 0.7% |
| Shared room | 47 | 0.3% |
Words
| Value | Count | Frequency (%) |
| entire | 12992 | 38.8% |
| home/apt | 12992 | 38.8% |
| room | 3732 | 11.2% |
| private | 3569 | 10.7% |
| hotel | 116 | 0.3% |
| shared | 47 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 29716 | |
| t | 29669 | |
| o | 20572 | |
| r | 20340 | |
| 16724 | 7.0% | |
| m | 16724 | 7.0% |
| a | 16608 | 6.9% |
| i | 16561 | 6.9% |
| h | 13039 | 5.4% |
| E | 12992 | 5.4% |
| Other values (9) | 46440 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 192945 | |
| Uppercase Letter | 16724 | 7.0% |
| Space Separator | 16724 | 7.0% |
| Other Punctuation | 12992 | 5.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 29716 | |
| t | 29669 | |
| o | 20572 | |
| r | 20340 | |
| m | 16724 | |
| a | 16608 | |
| i | 16561 | |
| h | 13039 | |
| n | 12992 | |
| p | 12992 | |
| Other values (3) | 3732 | 1.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 12992 | |
| P | 3569 | 21.3% |
| H | 116 | 0.7% |
| S | 47 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| 16724 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 12992 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 209669 | |
| Common | 29716 | 12.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 29716 | |
| t | 29669 | |
| o | 20572 | |
| r | 20340 | |
| m | 16724 | |
| a | 16608 | |
| i | 16561 | |
| h | 13039 | |
| E | 12992 | |
| n | 12992 | |
| Other values (7) | 20456 |
Common
| Value | Count | Frequency (%) |
| 16724 | ||
| / | 12992 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 239385 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 29716 | |
| t | 29669 | |
| o | 20572 | |
| r | 20340 | |
| 16724 | 7.0% | |
| m | 16724 | 7.0% |
| a | 16608 | 6.9% |
| i | 16561 | 6.9% |
| h | 13039 | 5.4% |
| E | 12992 | 5.4% |
| Other values (9) | 46440 |
price
Numeric
| Distinct | 509 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 156.1597704 |
| Minimum | 0 |
|---|---|
| Maximum | 8000 |
| Zeros | 16 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 57 |
| Q1 | 95 |
| median | 129 |
| Q3 | 180 |
| 95-th percentile | 320 |
| Maximum | 8000 |
| Range | 8000 |
| Interquartile range (IQR) | 85 |
Descriptive statistics
| Standard deviation | 172.4871696 |
|---|---|
| Coefficient of variation (CV) | 1.104555732 |
| Kurtosis | 1033.124837 |
| Mean | 156.1597704 |
| Median Absolute Deviation (MAD) | 40 |
| Skewness | 25.77570208 |
| Sum | 2611616 |
| Variance | 29751.82366 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 150 | 881 | 5.3% |
| 100 | 737 | 4.4% |
| 120 | 617 | 3.7% |
| 200 | 535 | 3.2% |
| 125 | 479 | 2.9% |
| 110 | 414 | 2.5% |
| 90 | 386 | 2.3% |
| 250 | 386 | 2.3% |
| 80 | 376 | 2.2% |
| 130 | 357 | 2.1% |
| Other values (499) | 11556 |
| Value | Count | Frequency (%) |
| 0 | 16 | |
| 4 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 16 | 1 | < 0.1% |
| 17 | 2 | < 0.1% |
| 18 | 1 | < 0.1% |
| 19 | 2 | < 0.1% |
| 20 | 8 | |
| 21 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 8000 | 2 | |
| 7999 | 1 | < 0.1% |
| 6477 | 2 | |
| 5000 | 1 | < 0.1% |
| 3000 | 1 | < 0.1% |
| 2500 | 2 | |
| 2000 | 3 | |
| 1925 | 1 | < 0.1% |
| 1750 | 1 | < 0.1% |
| 1511 | 1 | < 0.1% |
| Distinct | 63 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.009208323 |
| Minimum | 1 |
|---|---|
| Maximum | 1100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 7 |
| Maximum | 1100 |
| Range | 1099 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 19.76198833 |
|---|---|
| Coefficient of variation (CV) | 4.92914978 |
| Kurtosis | 1802.352945 |
| Mean | 4.009208323 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 37.94287912 |
| Sum | 67050 |
| Variance | 390.5361826 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 6207 | |
| 3 | 4027 | |
| 1 | 2783 | |
| 4 | 1403 | 8.4% |
| 5 | 928 | 5.5% |
| 7 | 517 | 3.1% |
| 6 | 235 | 1.4% |
| 14 | 112 | 0.7% |
| 10 | 104 | 0.6% |
| 30 | 76 | 0.5% |
| Other values (53) | 332 | 2.0% |
| Value | Count | Frequency (%) |
| 1 | 2783 | |
| 2 | 6207 | |
| 3 | 4027 | |
| 4 | 1403 | 8.4% |
| 5 | 928 | 5.5% |
| 6 | 235 | 1.4% |
| 7 | 517 | 3.1% |
| 8 | 35 | 0.2% |
| 9 | 13 | 0.1% |
| 10 | 104 | 0.6% |
| Value | Count | Frequency (%) |
| 1100 | 1 | < 0.1% |
| 1001 | 1 | < 0.1% |
| 1000 | 1 | < 0.1% |
| 999 | 1 | < 0.1% |
| 500 | 1 | < 0.1% |
| 365 | 6 | |
| 300 | 3 | |
| 240 | 1 | < 0.1% |
| 222 | 1 | < 0.1% |
| 200 | 3 |
number_of_reviews
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 409 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.10368333 |
| Minimum | 0 |
|---|---|
| Maximum | 865 |
| Zeros | 2318 |
| Zeros (%) | 13.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 8 |
| Q3 | 21 |
| 95-th percentile | 101 |
| Maximum | 865 |
| Range | 865 |
| Interquartile range (IQR) | 19 |
Descriptive statistics
| Standard deviation | 55.37988933 |
|---|---|
| Coefficient of variation (CV) | 2.297569569 |
| Kurtosis | 43.8654417 |
| Mean | 24.10368333 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 5.67167682 |
| Sum | 403110 |
| Variance | 3066.932142 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2318 | 13.9% |
| 1 | 1376 | 8.2% |
| 2 | 1058 | 6.3% |
| 3 | 920 | 5.5% |
| 4 | 769 | 4.6% |
| 5 | 660 | 3.9% |
| 6 | 620 | 3.7% |
| 7 | 555 | 3.3% |
| 8 | 493 | 2.9% |
| 10 | 486 | 2.9% |
| Other values (399) | 7469 |
| Value | Count | Frequency (%) |
| 0 | 2318 | |
| 1 | 1376 | |
| 2 | 1058 | |
| 3 | 920 | 5.5% |
| 4 | 769 | 4.6% |
| 5 | 660 | 3.9% |
| 6 | 620 | 3.7% |
| 7 | 555 | 3.3% |
| 8 | 493 | 2.9% |
| 9 | 430 | 2.6% |
| Value | Count | Frequency (%) |
| 865 | 1 | |
| 838 | 1 | |
| 790 | 1 | |
| 778 | 1 | |
| 713 | 1 | |
| 711 | 1 | |
| 704 | 1 | |
| 696 | 1 | |
| 683 | 1 | |
| 622 | 1 |
| Distinct | 2040 |
|---|---|
| Distinct (%) | 14.2% |
| Missing | 2318 |
| Missing (%) | 13.9% |
| Memory size | 1015.1 KiB |
| 2020-01-02 | 118 |
|---|---|
| 2019-10-20 | 90 |
| 2020-01-03 | 85 |
| 2020-03-08 | 85 |
| 2020-02-16 | 84 |
| Other values (2035) |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 144060 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 400 |
|---|---|
| Unique (%) | 2.8% |
Sample
| 1st row | 2019-11-21 |
|---|---|
| 2nd row | 2020-03-27 |
| 3rd row | 2020-01-02 |
| 4th row | 2020-07-25 |
| 5th row | 2020-02-06 |
Common Values
| Value | Count | Frequency (%) |
| 2020-01-02 | 118 | 0.7% |
| 2019-10-20 | 90 | 0.5% |
| 2020-01-03 | 85 | 0.5% |
| 2020-03-08 | 85 | 0.5% |
| 2020-02-16 | 84 | 0.5% |
| 2019-10-21 | 77 | 0.5% |
| 2020-01-01 | 77 | 0.5% |
| 2019-09-15 | 68 | 0.4% |
| 2019-09-16 | 62 | 0.4% |
| 2020-02-17 | 52 | 0.3% |
| Other values (2030) | 13608 | |
| (Missing) | 2318 | 13.9% |
Length
| Value | Count | Frequency (%) |
| 2020-01-02 | 118 | 0.8% |
| 2019-10-20 | 90 | 0.6% |
| 2020-01-03 | 85 | 0.6% |
| 2020-03-08 | 85 | 0.6% |
| 2020-02-16 | 84 | 0.6% |
| 2020-01-01 | 77 | 0.5% |
| 2019-10-21 | 77 | 0.5% |
| 2019-09-15 | 68 | 0.5% |
| 2019-09-16 | 62 | 0.4% |
| 2019-12-30 | 52 | 0.4% |
| Other values (2030) | 13608 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 35638 | |
| - | 28812 | |
| 2 | 25519 | |
| 1 | 23245 | |
| 9 | 7957 | 5.5% |
| 8 | 5716 | 4.0% |
| 7 | 4889 | 3.4% |
| 6 | 3904 | 2.7% |
| 3 | 3263 | 2.3% |
| 5 | 2832 | 2.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 115248 | |
| Dash Punctuation | 28812 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 35638 | |
| 2 | 25519 | |
| 1 | 23245 | |
| 9 | 7957 | 6.9% |
| 8 | 5716 | 5.0% |
| 7 | 4889 | 4.2% |
| 6 | 3904 | 3.4% |
| 3 | 3263 | 2.8% |
| 5 | 2832 | 2.5% |
| 4 | 2285 | 2.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 28812 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 144060 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 35638 | |
| - | 28812 | |
| 2 | 25519 | |
| 1 | 23245 | |
| 9 | 7957 | 5.5% |
| 8 | 5716 | 4.0% |
| 7 | 4889 | 3.4% |
| 6 | 3904 | 2.7% |
| 3 | 3263 | 2.3% |
| 5 | 2832 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 144060 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 35638 | |
| - | 28812 | |
| 2 | 25519 | |
| 1 | 23245 | |
| 9 | 7957 | 5.5% |
| 8 | 5716 | 4.0% |
| 7 | 4889 | 3.4% |
| 6 | 3904 | 2.7% |
| 3 | 3263 | 2.3% |
| 5 | 2832 | 2.0% |
reviews_per_month
Real number (ℝ≥0)
HIGH CORRELATION, MISSING
| Distinct | 596 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 2318 |
| Missing (%) | 13.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6094911842 |
| Minimum | 0.01 |
|---|---|
| Maximum | 30 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 0.03 |
| Q1 | 0.11 |
| median | 0.27 |
| Q3 | 0.6 |
| 95-th percentile | 2.69 |
| Maximum | 30 |
| Range | 29.99 |
| Interquartile range (IQR) | 0.49 |
Descriptive statistics
| Standard deviation | 1.122150628 |
|---|---|
| Coefficient of variation (CV) | 1.841126922 |
| Kurtosis | 119.2612226 |
| Mean | 0.6094911842 |
| Median Absolute Deviation (MAD) | 0.19 |
| Skewness | 7.423858119 |
| Sum | 8780.33 |
| Variance | 1.259222032 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.02 | 442 | 2.6% |
| 0.05 | 435 | 2.6% |
| 0.03 | 397 | 2.4% |
| 0.04 | 378 | 2.3% |
| 0.06 | 359 | 2.1% |
| 0.09 | 340 | 2.0% |
| 0.08 | 333 | 2.0% |
| 0.1 | 318 | 1.9% |
| 0.11 | 304 | 1.8% |
| 0.13 | 297 | 1.8% |
| Other values (586) | 10803 | |
| (Missing) | 2318 | 13.9% |
| Value | Count | Frequency (%) |
| 0.01 | 133 | 0.8% |
| 0.02 | 442 | |
| 0.03 | 397 | |
| 0.04 | 378 | |
| 0.05 | 435 | |
| 0.06 | 359 | |
| 0.07 | 270 | |
| 0.08 | 333 | |
| 0.09 | 340 | |
| 0.1 | 318 |
| Value | Count | Frequency (%) |
| 30 | 1 | |
| 29.02 | 1 | |
| 28.31 | 1 | |
| 21 | 1 | |
| 18.58 | 1 | |
| 17.48 | 1 | |
| 16 | 1 | |
| 14.79 | 1 | |
| 14.71 | 1 | |
| 13 | 1 |
| Distinct | 20 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.801124133 |
| Minimum | 1 |
|---|---|
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 5 |
| Maximum | 50 |
| Range | 49 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 3.611364083 |
|---|---|
| Coefficient of variation (CV) | 2.005061182 |
| Kurtosis | 104.5765644 |
| Mean | 1.801124133 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 9.189609196 |
| Sum | 30122 |
| Variance | 13.04195054 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 13507 | |
| 2 | 1696 | 10.1% |
| 3 | 444 | 2.7% |
| 4 | 176 | 1.1% |
| 6 | 162 | 1.0% |
| 5 | 135 | 0.8% |
| 7 | 84 | 0.5% |
| 8 | 80 | 0.5% |
| 9 | 63 | 0.4% |
| 50 | 50 | 0.3% |
| Other values (10) | 327 | 2.0% |
| Value | Count | Frequency (%) |
| 1 | 13507 | |
| 2 | 1696 | 10.1% |
| 3 | 444 | 2.7% |
| 4 | 176 | 1.1% |
| 5 | 135 | 0.8% |
| 6 | 162 | 1.0% |
| 7 | 84 | 0.5% |
| 8 | 80 | 0.5% |
| 9 | 63 | 0.4% |
| 10 | 50 | 0.3% |
| Value | Count | Frequency (%) |
| 50 | 50 | |
| 31 | 31 | |
| 21 | 42 | |
| 20 | 40 | |
| 16 | 16 | 0.1% |
| 15 | 15 | 0.1% |
| 14 | 28 | |
| 13 | 13 | 0.1% |
| 12 | 48 | |
| 11 | 44 |
| Distinct | 363 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 60.5230806 |
| Minimum | 0 |
|---|---|
| Maximum | 365 |
| Zeros | 10763 |
| Zeros (%) | 64.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 71 |
| 95-th percentile | 354 |
| Maximum | 365 |
| Range | 365 |
| Interquartile range (IQR) | 71 |
Descriptive statistics
| Standard deviation | 111.861502 |
|---|---|
| Coefficient of variation (CV) | 1.848245346 |
| Kurtosis | 1.627264063 |
| Mean | 60.5230806 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.755180469 |
| Sum | 1012188 |
| Variance | 12512.99564 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 10763 | |
| 365 | 272 | 1.6% |
| 364 | 173 | 1.0% |
| 89 | 131 | 0.8% |
| 90 | 115 | 0.7% |
| 88 | 97 | 0.6% |
| 2 | 85 | 0.5% |
| 363 | 84 | 0.5% |
| 180 | 76 | 0.5% |
| 185 | 76 | 0.5% |
| Other values (353) | 4852 |
| Value | Count | Frequency (%) |
| 0 | 10763 | |
| 1 | 75 | 0.4% |
| 2 | 85 | 0.5% |
| 3 | 51 | 0.3% |
| 4 | 42 | 0.3% |
| 5 | 44 | 0.3% |
| 6 | 43 | 0.3% |
| 7 | 51 | 0.3% |
| 8 | 44 | 0.3% |
| 9 | 35 | 0.2% |
| Value | Count | Frequency (%) |
| 365 | 272 | |
| 364 | 173 | |
| 363 | 84 | 0.5% |
| 362 | 64 | 0.4% |
| 361 | 35 | 0.2% |
| 360 | 41 | 0.2% |
| 359 | 28 | 0.2% |
| 358 | 58 | 0.3% |
| 357 | 23 | 0.1% |
| 356 | 15 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
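The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator uses; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (the n - 1 factors cancel in the ratio).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
# Shifting and rescaling y leaves r unchanged (location and scale invariance).
r_rescaled = pearson_r(x, [3 * v + 5 for v in y])
print(r, r_rescaled)
```

Rescaling by a negative factor would flip the sign of r but keep its magnitude.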
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
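As a rough sketch of this definition, the snippet below ranks both variables (ties get the average of the positions they span) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical and the data are made up for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # nonlinear but strictly increasing
print(spearman_rho(x, y))
```

Because y increases strictly with x, the ranks coincide and ρ reaches its maximum even though the relationship is not linear.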
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
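The pair-counting description above corresponds to the variant usually called τ-a, which makes no correction for ties. The sketch below implements exactly that description in plain Python; library implementations often use a tie-corrected variant instead, so results can differ when the data contain ties. The function name `kendall_tau_a` is an assumption for this example.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; the pair counts toward neither
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant pairs, 1 discordant pair, 6 pairs in total
print(tau)
```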
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. We use a bias-corrected measure that was proposed by Bergsma in 2013 and that can be found here.
First rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2818 | Quiet Garden View Room & Super Fast WiFi | 3159 | Daniel | NaN | Oostelijk Havengebied - Indische Buurt | 52.36435 | 4.94358 | Private room | 60 | 3 | 278 | 2019-11-21 | 2.90 | 1 | 137 |
| 1 | 20168 | Studio with private bathroom in the centre 1 | 59484 | Alexander | NaN | Centrum-Oost | 52.36407 | 4.89393 | Private room | 106 | 1 | 339 | 2020-03-27 | 3.73 | 2 | 0 |
| 2 | 25428 | Lovely, sunny 1 bed apt in Ctr (w.lift) & firepl. | 56142 | Joan | NaN | Centrum-West | 52.37490 | 4.88487 | Entire home/apt | 100 | 14 | 5 | 2020-01-02 | 0.12 | 1 | 58 |
| 3 | 27886 | Romantic, stylish B&B houseboat in canal district | 97647 | Flip | NaN | Centrum-West | 52.38761 | 4.89188 | Private room | 138 | 2 | 221 | 2020-07-25 | 2.17 | 1 | 217 |
| 4 | 28871 | Comfortable double room | 124245 | Edwin | NaN | Centrum-West | 52.36775 | 4.89092 | Private room | 75 | 2 | 338 | 2020-02-06 | 4.52 | 2 | 298 |
| 5 | 29051 | Comfortable single room | 124245 | Edwin | NaN | Centrum-Oost | 52.36584 | 4.89111 | Private room | 55 | 2 | 480 | 2020-02-26 | 5.44 | 2 | 318 |
| 6 | 41125 | Amsterdam Center Entire Apartment | 178515 | Fatih | NaN | Centrum-West | 52.37920 | 4.88432 | Entire home/apt | 160 | 4 | 89 | 2020-02-10 | 0.76 | 1 | 0 |
| 7 | 43109 | Oasis in the middle of Amsterdam | 188098 | Aukje | NaN | Centrum-West | 52.37366 | 4.88808 | Entire home/apt | 211 | 3 | 59 | 2019-04-16 | 0.56 | 1 | 365 |
| 8 | 43980 | View into park / museum district (long/short stay) | 65041 | Ym | NaN | Zuid | 52.35735 | 4.86158 | Entire home/apt | 67 | 7 | 61 | 2015-04-23 | 0.53 | 2 | 0 |
| 9 | 46386 | Cozy loft in central Amsterdam | 207342 | Joost | NaN | De Pijp - Rivierenbuurt | 52.35198 | 4.90746 | Entire home/apt | 150 | 3 | 3 | 2018-01-03 | 0.03 | 1 | 0 |
Last rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16714 | 50760230 | Luxury appartment in the heart of the Oud-West | 11227766 | Florentine | NaN | De Baarsjes - Oud-West | 52.364980 | 4.862352 | Entire home/apt | 127 | 14 | 0 | NaN | NaN | 1 | 19 |
| 16715 | 50762636 | Double/Triple with private bathroom in a homestay | 311743899 | Hilletje | NaN | Bos en Lommer | 52.382099 | 4.856841 | Private room | 76 | 2 | 0 | NaN | NaN | 1 | 336 |
| 16716 | 50765265 | Private 55m2 apartment next to Rijksmuseum (500m) | 101998763 | Joost | NaN | Zuid | 52.353967 | 4.886547 | Entire home/apt | 85 | 1 | 0 | NaN | NaN | 1 | 164 |
| 16717 | 50767325 | lovely studio near center of Amsterdam | 410277680 | Davoud | NaN | Noord-Oost | 52.399251 | 4.938890 | Entire home/apt | 90 | 1 | 0 | NaN | NaN | 1 | 187 |
| 16718 | 50770030 | Spacious room with balcony in lively neighbourhood | 29687232 | Niels | NaN | De Baarsjes - Oud-West | 52.364852 | 4.863416 | Private room | 55 | 1 | 0 | NaN | NaN | 1 | 76 |
| 16719 | 50770088 | Van Westhouse 3 City centre Canal house studio | 19228755 | Thedie | NaN | Centrum-Oost | 52.372536 | 4.902582 | Private room | 99 | 5 | 0 | NaN | NaN | 3 | 256 |
| 16720 | 50770121 | Sfeervol en warm appartement * A’dam Oud-West * | 250631796 | Danny | NaN | De Baarsjes - Oud-West | 52.372001 | 4.872452 | Entire home/apt | 61 | 3 | 0 | NaN | NaN | 1 | 191 |
| 16721 | 50800575 | leuke plek voor 2 in de pijp | 41488364 | Bart | NaN | De Pijp - Rivierenbuurt | 52.351809 | 4.906918 | Entire home/apt | 80 | 1 | 0 | NaN | NaN | 1 | 356 |
| 16722 | 50806119 | Spacious and Light apartment in West of Amsterdam! | 14289831 | Hein | NaN | Bos en Lommer | 52.379064 | 4.858107 | Entire home/apt | 125 | 7 | 0 | NaN | NaN | 1 | 26 |
| 16723 | 50807501 | Cozy house with the garden | 19645752 | Andrei | NaN | Slotervaart | 52.367155 | 4.838638 | Entire home/apt | 120 | 5 | 0 | NaN | NaN | 1 | 24 |